Search results for "Data structure alignment"


SPECTR

2018

Modern high-throughput sequencing platforms can produce large amounts of short-read DNA data at low cost. Error correction is an important but time-consuming initial step when processing this data in order to improve the quality of downstream analyses. In this paper, we present a Scalable Parallel Error CorrecToR designed to improve the throughput of DNA error correction for Illumina reads on various parallel platforms. Our design is based on a k-spectrum approach, in which a Bloom filter is frequently probed as a key operation, and is optimized for AVX-512-based multi-core CPUs, Xeon Phi many-cores (both KNC and KNL), and heterogeneous compute clusters. A number of architecture-specific opt…

0301 basic medicine; 03 medical and health sciences; Multi-core processor; 030104 developmental biology; Speedup; Xeon; Computer science; Data structure alignment; Parallel computing; Error detection and correction; Supercomputer; Throughput (business); Xeon Phi
Proceedings of the 47th International Conference on Parallel Processing

Lattice Boltzmann Simulations at Petascale on Multi-GPU Systems with Asynchronous Data Transfer and Strictly Enforced Memory Read Alignment

2015

The lattice Boltzmann method is a well-established numerical approach for complex fluid flow simulations. Recently, general-purpose graphics processing units have become accessible as high-performance computing resources at large scale. We report on implementing a lattice Boltzmann solver for multi-GPU systems that achieves 0.69 PFLOPS performance on 16384 GPUs. In addition to optimizing the data layout on the GPUs and eliminating the halo sites, we make use of the possibility to overlap data transfer between the host CPU and the device GPU with computing on the GPU. We simulate flow in porous media and measure both strong and weak scaling performance with the emphasis being on a large scale…

ta113; ta114; Computer science; Lattice Boltzmann methods; GPU; Parallel computing; Solver; Lattice Boltzmann; memory alignment; Computational science; Petascale computing; Asynchronous communication; Data structure alignment; Graphics; asynchronous communication; Titan; Host (network); ComputingMethodologies_COMPUTERGRAPHICS; Data transmission
Euromicro international conference on parallel, distributed and network-based processing

Designing a graphics processing unit accelerated petaflop capable lattice Boltzmann solver: Read aligned data layouts and asynchronous communication

2016

The lattice Boltzmann method is a well-established numerical approach for complex fluid flow simulations. Recently, general-purpose graphics processing units (GPUs) have become available as high-performance computing resources at large scale. We report on designing and implementing a lattice Boltzmann solver for multi-GPU systems that achieves 1.79 PFLOPS performance on 16,384 GPUs. To achieve this performance, we introduce a GPU-compatible version of the so-called bundle data layout and eliminate the halo sites in order to improve data access alignment. Furthermore, we make use of the possibility to overlap data transfer between the host central processing unit and the device GPU with com…

virtauslaskenta; large-scale I/O; Computer science; Graphics processing unit; Lattice Boltzmann methods; computational fluid dynamics; Parallel computing; graphics processing unit; 01 natural sciences; memory alignment; processors; 010305 fluids & plasmas; Theoretical Computer Science; 0103 physical sciences; Data structure alignment; 0101 mathematics; Graphics; ComputingMethodologies_COMPUTERGRAPHICS; ta113; data layout; ta114; prosessorit; Solver; Lattice Boltzmann; 010101 applied mathematics; Data access; Hardware and Architecture; Asynchronous communication; Central processing unit; asynchronous communication; Titan; Software
The International Journal of High Performance Computing Applications